Image Provenance Analysis at Scale
Abstract
Prior art has shown it is possible to estimate, through image processing and computer vision techniques, the types and parameters of transformations that have been applied to the content of individual images to obtain new images. Given a large corpus of images and a query image, an interesting further step is to retrieve the set of original images whose content is present in the query image, as well as the detailed sequences of transformations that yield the query image given the original images. This is a problem that recently has received the name of image provenance analysis. In these times of public media manipulation (e.g., fake news and meme sharing), obtaining the history of image transformations is relevant for fact checking and authorship verification, among many other applications. This article presents an end-to-end processing pipeline for image provenance analysis, which works at real-world scale. It employs a cutting-edge image filtering solution that is custom-tailored for the problem at hand, as well as novel techniques for obtaining the provenance graph that expresses how the images, as nodes, are ancestrally connected. A comprehensive set of experiments for each stage of the pipeline is provided, comparing the proposed solution with state-of-the-art results, employing previously published datasets. In addition, this work introduces a new dataset of real-world provenance cases from the social media site Reddit, along with baseline results.
Keywords: Digital Image Forensics, Digital Humanities, Image Retrieval, Graphs, Image Provenance, Image Phylogeny
Similar resources
Ontology-Driven Provenance Management in eScience: An Application in Parasite Research
Provenance, from the French word “provenir”, describes the lineage or history of a data entity. Provenance is critical information in scientific applications to verify experiment process, validate data quality and associate trust values with scientific results. Current industrial scale eScience projects require an end-to-end provenance management infrastructure. This infrastructure needs to be ...
Enabling Provenance on Large Scale e-Science Applications
Large-scale e-Science experiments present unprecedented data handling requirements with their multi-petabyte data storages. Complex software applications, such as the ATLAS High Energy Physics experiment at CERN, run throughout Grid computing sites around the world in a distributed environment, with scientists performing concurrent analysis on data and producing new data products shared among t...
Image provenance inference through content-based device fingerprint analysis
The last few decades have witnessed the increasing popularity of low-cost and high-quality digital imaging devices, ranging from digital cameras to cellphones with built-in cameras, which makes the acquisition of digital images easier than ever before. Meanwhile, the ever-increasing convenience of image acquisition has bred the pervasiveness of powerful image editing tools, which allow even...
The Application of Cloud Computing to the Creation of Image Mosaics and Management of Their Provenance
We have used the Montage image mosaic engine to investigate the cost and performance of processing images on the Amazon EC2 cloud, and to inform the requirements that higher-level products impose on provenance management technologies. We will present a detailed comparison of the performance of Montage on the cloud and on the Abe high performance cluster at the National Center for Supercomputing ...
Intelligent Workflow Systems and Provenance-Aware Software
Workflows are increasingly used in science to manage complex computations and data processing at large scale. Intelligent workflow systems provide assistance in setting up parameters and data, validating workflows created by users, and automating the generation of workflows from high-level user guidance. These systems use semantic workflows that extend workflow representations with semantic con...
Journal: CoRR
Volume: abs/1801.06510
Publication date: 2018